Needs-based health workforce planning for pregnancy care in Brazil

Introduction

This supplementary material describes the methodological procedures, as well as documents the codes, based on R language, used for the analysis.

We adapted Asamani et al. (2021) orientations in order build a needs-based health workforce planning for Brazilian context.

Asamani et al. (2021) suggest a set of six consideration that researchers should follow when developing needs based workforce planning, which are:

  1. Defining the scope - defining jurisdictional coverage, health system objective and timeframe.

  2. Analysis of the population health service needs - population demography, health status, level of service

  3. Translating the evidence-based service requirement into health workforce requirement - matching health services with competent professionals, eliciting measures of productivity

  4. Exploring resource implications - comparing demand and supply, cost implications)

  5. Conducting sensitivity analysis - perform simulation according to parameter uncertainty or policy scenarios

  6. Conducting model validation - compare to previous experiences and stakeholder consultation

The following flowchart presents the main stages.

Preparing R environment

Before making any progress, we need to load the R libraries that will assist performing our analysis.

library(tidyverse)
library(readxl)
library(geojsonio)
library(broom)
library(jsonlite)
library(RODBC)
library(patchwork)

1 - Defining Scope

The jurisdictional coverage was health regions of Goias state. Goiás is in the middle-west of Brazil. The objective of analysis was targeted to health workforce planning for pregnant and newborn. We employed forecasts for the following two years using artificial intelligence algorithms.

The following figure illustrates Goiás state’s location.

library(geobr)
Warning: package 'geobr' was built under R version 4.2.3
dataset <- list_geobr()

states <- read_state(
  year=2019, 
  showProgress = FALSE
  ) 
Using year 2019
states <- 
  states |> 
  mutate(marcar = factor(if_else(abbrev_state == "GO", "1","2")))

# Remove plot axis
no_axis <- theme(axis.title=element_blank(),
                 axis.text=element_blank(),
                 axis.ticks=element_blank())

# Plot all Brazilian states

cores = c("lightblue","lightgray")

a <- states |> 
      ggplot() +
        geom_sf(aes(fill=marcar), color="#2D3E50", size=.15, show.legend = FALSE) +
        theme_minimal() +
        scale_fill_manual(values = cores) +
        no_axis

a

The following figure illustrates Goiás state’s location. Goiás states has five health regions.

2 - Analysis of population health service needs

Each pregnant will require a set of services during pregnancy and after delivery according to recommendation of Brazilian Minister of Health. The same happens for the baby as soon as it is born.

\[ NHS\underset{r,t}{} = SD\underset{r,t}{} \times (P\underset{r,t}{} \times \sum_{t = 0}^{10}S\underset{y,t}{} + B\underset{r,t}{} \times \sum_{t=9}^{33}S\underset{y,t}{}) \]

where:

  • \(NHS\underset{t}{}\) represents the total programmatic services needed adress each pregnant (P) and Baby (B) of a given health region r in a given time t. The index for pregnants goes from 0 to 10 because each month has programmatic services dedicated to this public. The same happens for the baby, starting from the month it is born.
  • S represents the number of services of type y required for each pregnant or baby in given time t before and after the child birth. The total number of programatic services is listed on appendix A.
  • SD represents a percentage of population which dependent of Brazilian Health Unified System (Sistema Único de Saúde - SUS) in a given health region r and t

2.1. Health Status (P and B) and SUS dependent population (SD)

In the following chunk of script

This chunk of scripts

#reading and selecting variables
predictions <- read_excel("~/GitHub/dimensionamento/05_Gestantes/05_python/previsoes_go.xlsx") |>   
               select(codibge, data, qtd) |> 
               mutate(qtd = round(qtd))

# reading health insurance data
health_insurance <- read_csv("~/GitHub/dimensionamento/05_Gestantes/05_python/beneficiarios_plano_saude.csv") |>  
         filter(mes == "06") |>  
         mutate(perc_sus = (100 - percentual_pop_coberta)/100) |>  
         select(cod_regsaud, perc_sus, regiao_saude) 

# joining both dataframes to deduce the population covered by health insurance
predictions <- 
  predictions |>  
  left_join(health_insurance, by = c("codibge"="cod_regsaud")) |>  
  mutate(qtd_sus = round(qtd * perc_sus)) |> 
  mutate(data = as.character(data))

DT::datatable(predictions)

2.2. Estimating services (S)

We will now quantify the number and variety of services need to assist pregnant women and the newborn in the primary health care (PHC) setting.

For each birth we will calculate a number of service needed before and after the event.

for(i in 1:nrow(predictions)){
  row <- predictions[i,]
  url = paste("http://200.137.215.27:5025/calcula_procedimentos?mes_ano=", 
              substring(row$data, 1, 7), 
              "&nascidos_vivos=", 
              round(row$qtd_sus, 0), sep = '')
  temp <- fromJSON(url)
  temp$data <- row$data
  temp$qtd <- row$qtd_sus
  temp$ibge <- row$codibge
  
  servicos <- rbind(temp, servicos)
  
  print(paste("Chamando:",url))
}
# filter just PHC setting
phc_services <- services |> 
                filter(nivel_atencao == "APS")

DT::datatable(phc_services)

3 - Translating the evidence-based service requirements into health workkforce requirement

Target Population

At first we will access SINASC projections data. These projections were calculated by two co-authors. Multilayer Perceptron (MLP) algorithm registered the best results among other machine learning and statistics technics.

We will deduce the population covered by health insurance.

Total services

PHC Services and professionals

We will filter only services on the PHC setting.

We then will filter only those performed by registered nurses and physicians. We need to join two dataframes: professionals and procedures.

We will transform the number of procedures in number of hours considering past experiences published. Using time motion techniques Bonfim et al. (2014) registered that nursing appointments took about 25,3 minutes (Brazil average) in the PHC setting. Educative actions took about 73 minutes. We will use these number as reference.

# filter just PHC setting
phc_services <- services |> 
                filter(nivel_atencao == "APS")

# loading dataframe which combines procedures and the respective professional
procedures_professional <- read_excel("~/GitHub/dimensionamento/05_Gestantes/05_python/calendario-procedimentos.xlsx", 
    sheet = "procedimentos_profissionais") |> 
                            select(codigo_sigtap, categoria, CBO) |> 
                            mutate(codigo_sigtap = as.numeric(codigo_sigtap)) 

# procedures performed by two cadres will be split equally among them 
# the procedure 301010080 is growth and development consultation. It is divided by 10 because the same procedure id is used for all procedure.  

qtt_professional <- 
  procedures_professional |> 
  group_by(codigo_sigtap) |> 
  count() |> 
  mutate(n = case_when(codigo_sigtap == '301010080' ~ n/10,
                       codigo_sigtap == '101010010' ~ n/2,
                       TRUE ~ n))

# In the following code:
# 1) filter for only procedures performed by nurses and physicians 
# 2) calculate the number of procedures exclusive by cadre according to the last
# code
# 3) filter only appointments and educative actions
# 4) filter only those procedures executed in the year of 2024. 
# 5) create new variable (procedures in hours)

professional_services <- phc_services |> 
                            left_join(qtt_professional, by = "codigo_sigtap") |> 
                            mutate(qtt_cadre = quantidade/n) |> 
                            left_join(procedures_professional, by = "codigo_sigtap") |> 
                            filter(categoria == "Enfermeiro" | categoria == "Médico") |> 
                            filter(tipo_procedimento == "Consultas ou Visitas" |
                                   tipo_procedimento == "Ações Educacionais") |> 
                            mutate(ano = year(mês_procedimento_realizado)) |> 
                            filter(ano == 2024) |> 
                            mutate(tempo = if_else(tipo_procedimento == "Consultas ou Visitas",
                                                   qtt_cadre * 0.66,
                                                   qtt_cadre * 1.266))

DT::datatable(professional_services)

Now we will transform the total demand of procedures into number of 40 hour full-time equivalent professionals.

We will divide the full time required by procedures by 126 which is the total available time for each month by professional.

demanda <- professional_services |> 
        select(ano, mês_procedimento_realizado, ibge, CBO, categoria, tempo) |> 
        mutate(fte40 = tempo/133) |> 
        group_by(ano, mês_procedimento_realizado, ibge, CBO, categoria) |> 
        summarise(tempo_total = sum(tempo),
                  fte40_demanda = sum(fte40),
                  fte40_demanda = round(fte40_demanda, 2))

DT::datatable(demanda)

Calculating Supply

dremio_host <- Sys.getenv("endereco")
dremio_port <- Sys.getenv("port")
dremio_uid <- Sys.getenv("uid")
dremio_pwd <- Sys.getenv("datalake")

channel <- odbcDriverConnect(sprintf("DRIVER=Dremio Connector;HOST=%s;PORT=%s;UID=%s;PWD=%s;AUTHENTICATIONTYPE=Basic Authentication;CONNECTIONTYPE=Direct", dremio_host, dremio_port, dremio_uid, dremio_pwd))

consulta <- 'SELECT * FROM "Analytics Layer".Infraestrutura.Profissionais."Profissionais APS"'

oferta_GO <- sqlQuery(channel, consulta, 
                     as.is = TRUE)

We will deduce the supply considering only the workload performed to assist pregnant and newborn care.

We used a proxy number according to the volume of consultation for pregnant and newborn care in the SISAB.

producao_SISAB <- read_excel("producao_SISAB.xls") |> 
                      select(Cod_Regiao_Saude, Porcentagem)

producao_SISAB$Cod_Regiao_Saude = as.character(producao_SISAB$Cod_Regiao_Saude)

Transforming supply data

Our supply dataset will eventually have duplicate values due to multiple cadre association. In order to overcome this issue, we use a standard of full time equivalent metric.

We multiply the value by four to represent the monthly workload. Next, we transform the number of hours into professional hours, dividing the value by 126 hours. In this way, if we have 3,040 hours of a professional available in a month, it would be the equivalent of having 24 professionals of 40 hours.

We also deduct the total workload dedicated to direct activities (assistance). We used as a reference 60% based on past studies.

We also deduct the percentage dedicated exclusively to activities to assist pregnant women and newborns. As previously presented, we used SISAB data to assess the volume of assistance to this public in comparison to others.

oferta <- oferta_GO |> 
  mutate(cod_regsaud = as.character(cod_regsaud)) |> 
  mutate(ano_mes = ym(COMPETEN)) |> 
  mutate(horas = HORAOUTR + HORAHOSP + HORA_AMB) |> 
  mutate(prof = if_else(substr(CBO, 1, 4) == "2235", "Enfermeiro", "Médico")) |> 
  group_by(uf, cod_regsaud, regiao_saude, prof, ano_mes) |> 
  summarise(horas = 4 * sum(horas)) |> 
  left_join(producao_SISAB, by = c("cod_regsaud"="Cod_Regiao_Saude")) |> 
  mutate(fte40 = horas/133) |> 
  mutate(direto = if_else(prof == "Enfermeiro",
                          fte40 * 0.30,
                          fte40 * 0.39)) |> 
  mutate(liquido = direto * Porcentagem) |> 
  mutate(ano_mes_corrigido = ano_mes + years(2)) |> 
  select(-ano_mes) |> 
  mutate(Porcentagem = round(Porcentagem, 2),
         fte40 = round(fte40, 2),
         direto = round(direto, 2),
         liquido = round(liquido, 2))
`summarise()` has grouped output by 'uf', 'cod_regsaud', 'regiao_saude',
'prof'. You can override using the `.groups` argument.
DT::datatable(oferta)

Comparing demand and supply

We will now compare demand vs supply for each health region. the result metric represents the subtraction between the deduced supply and the demand.

The perc metric represents how much supply is currently available to meet, in percentage terms, the total demand needed in the future (2024).

demanda$ibge <- as.character(demanda$ibge)

demanda_oferta <- 
  demanda |>  
  left_join(oferta, by = c("ibge"="cod_regsaud",
                           "categoria"="prof",
                           "mês_procedimento_realizado"="ano_mes_corrigido")) |> 
  filter(uf != "NA") |> 
  mutate(ano = year(mês_procedimento_realizado)) |> 
  mutate(resultado = liquido - fte40_demanda) |> 
  group_by(ibge, ano, categoria,
           uf, regiao_saude) |> 
  summarise(resultado = sum(resultado),
            demanda = sum(fte40_demanda),
            oferta = sum(liquido))
`summarise()` has grouped output by 'ibge', 'ano', 'categoria', 'uf'. You can
override using the `.groups` argument.
demanda_oferta |> 
  select(-resultado) |> 
  gather(key = "demand_supply", value = "resultado", 
         6:7) |> 
  mutate(categoria = if_else(categoria == "Enfermeiro","Nurse","Physician"),
         demand_supply = if_else(demand_supply == "demanda","Demand","Supply")) |> 
  ggplot(aes(x = fct_reorder(regiao_saude,resultado), y = resultado, fill = demand_supply)) + 
  geom_col(position = "dodge") + coord_flip() + facet_wrap(~categoria) + 
  theme_minimal() + xlab("Region") + ylab("Demand vs Supply") +
  theme(legend.title= element_blank())

demanda$ibge <- as.character(demanda$ibge)

demanda_oferta <- 
  demanda |>  
  left_join(oferta, by = c("ibge"="cod_regsaud",
                           "categoria"="prof",
                           "mês_procedimento_realizado"="ano_mes_corrigido")) |> 
  filter(uf != "NA") |> 
  mutate(ano = year(mês_procedimento_realizado)) |> 
  mutate(resultado = liquido - fte40_demanda) |> 
  group_by(ibge, ano, categoria,
           uf, regiao_saude) |> 
  summarise(resultado = sum(resultado),
            demanda = sum(fte40_demanda),
            oferta = sum(liquido)) |> 
  mutate(resultado = round(resultado, 2)) |> 
  filter(ano == '2024') |> 
  mutate(percentage = (oferta * 100)/demanda,
         percentage = round(percentage, 2)) |> 
  mutate(id = as.integer(ibge)) |> 
  ungroup() |> 
  select(id, regiao_saude, categoria, resultado, percentage)
`summarise()` has grouped output by 'ibge', 'ano', 'categoria', 'uf'. You can
override using the `.groups` argument.
DT::datatable(demanda_oferta)

Map of health regions

We will now ilustrate the results, in terms of percentage, using a map.

spdf <- geojson_read("shape file regioes saude.json",  what = "sp")
demanda_oferta$id <- as.integer(demanda_oferta$id)

spdf_region <- spdf[ spdf@data$est_id == "52" , ]


spdf_fortified <- sf::st_as_sf(spdf_region)

spdf_fortified |>
  left_join(demanda_oferta, by = c("reg_id"="id")) |>
  mutate(categoria = if_else(categoria == "Médico","Physician","Nurse")) |> 
  ggplot() +
  geom_sf(aes(fill = percentage)) +
  geom_sf_text(aes(label = regiao_saude), size = 2.5) +
  theme_minimal() +
  scale_fill_gradient(low = "#F8766D", high = "#00BA38", n.breaks = 10) +
  facet_wrap(~categoria, nrow = 1) +
  theme(
    axis.line = element_blank(),
    axis.text.x = element_blank(),
    axis.text.y = element_blank(),
    axis.ticks = element_blank(),
    axis.title.x = element_blank(),
    axis.title.y = element_blank()
  ) + ggtitle("Supply vs Demand", "Nurses and Physician for Maternal and Newborn Assistance in the Primary Health Care Context")

Sensitivy Analysis

Função para cálculo de cenários

funcao_demanda_oferta <- function(tempo1, tempo2, perc_direto1, perc_direto2) {
  professional_services_2 <- phc_services %>%
    left_join(qtt_professional, by = "codigo_sigtap", suffix = c("_phc", "_qtt")) %>%
    mutate(qtt_cadre = quantidade / n) %>%
    left_join(procedures_professional, by = "codigo_sigtap", suffix = c("_phc", "_procedures")) %>%
    filter(categoria == "Enfermeiro" | categoria == "Médico") %>%
    filter(tipo_procedimento == "Consultas ou Visitas" | tipo_procedimento == "Ações Educacionais") %>%
    mutate(ano = year(mês_procedimento_realizado)) %>%
    filter(ano == 2024) %>%
    mutate(
      tempo = if_else(
        tipo_procedimento == "Consultas ou Visitas",
        qtt_cadre * tempo1, # 20 minutos
        qtt_cadre * tempo2
      )
    )
  
  demanda_2 <- professional_services_2 %>%
    select(ano, mês_procedimento_realizado, ibge, CBO, categoria, tempo) %>%
    mutate(fte40 = tempo / 139) %>%
    group_by(ano, mês_procedimento_realizado, ibge, CBO, categoria) %>%
    summarise(
      tempo_total = sum(tempo),
      fte40_demanda = sum(fte40),
      fte40_demanda = round(fte40_demanda, 2)
    )
  
  oferta_2 <- oferta_GO %>%
    mutate(cod_regsaud = as.character(cod_regsaud)) %>%
    mutate(ano_mes = ym(COMPETEN)) %>%
    mutate(horas = HORAOUTR + HORAHOSP + HORA_AMB) %>%
    mutate(prof = if_else(substr(CBO, 1, 4) == "2235", "Enfermeiro", "Médico")) %>%
    group_by(uf, cod_regsaud, regiao_saude, prof, ano_mes) %>%
    summarise(horas = 4 * sum(horas)) %>%
    left_join(producao_SISAB, by = c("cod_regsaud" = "Cod_Regiao_Saude")) %>%
    mutate(fte40 = horas / 133) %>%
    mutate(direto = if_else(prof == "Enfermeiro",
                          fte40 * perc_direto1,
                          fte40 * perc_direto2)) %>%
    mutate(liquido = direto * Porcentagem) %>%
    mutate(ano_mes_corrigido = ano_mes + years(2)) %>%
    select(-ano_mes) %>%
    mutate(
      Porcentagem = round(Porcentagem, 2),
      fte40 = round(fte40, 2),
      direto = round(direto, 2),
      liquido = round(liquido, 2)
    )
  
  demanda_2$ibge <- as.character(demanda_2$ibge)
  
  demanda_oferta_2 <- demanda_2 %>%
    left_join(
      oferta_2,
      by = c("ibge" = "cod_regsaud", 
             "categoria" = "prof", 
             "mês_procedimento_realizado" = "ano_mes_corrigido"),
      suffix = c("_demanda", "_oferta")
    ) %>%
    filter(uf != "NA") %>%
    mutate(ano = year(mês_procedimento_realizado)) %>%
    mutate(resultado = liquido - fte40_demanda) %>%
    group_by(ibge, ano, categoria, uf, regiao_saude) %>%
    summarise(
      resultado = sum(resultado),
      demanda = sum(fte40_demanda),
      oferta = sum(liquido)
    )


  return(demanda_oferta_2)
}

1 - Baseline scenario

baseline <- demanda_oferta |> 
                mutate(cenario = "baseline")

cenario_1_go <- 
  baseline |> 
  group_by(categoria, cenario) |> 
  summarise(resultado = sum(resultado))
`summarise()` has grouped output by 'categoria'. You can override using the
`.groups` argument.
cenario_1_go
# A tibble: 2 × 3
# Groups:   categoria [2]
  categoria  cenario  resultado
  <chr>      <chr>        <dbl>
1 Enfermeiro baseline   -12774.
2 Médico     baseline   -11776.

2 - Scenario 2 - Increase of Direct Activities

We will increase 10% of direct activities for both professionals

cenario_2 <- funcao_demanda_oferta(0.50, 1.00, 0.30, 0.39) |>  
  mutate(percentage = (oferta * 100)/demanda,
         percentage = round(percentage, 2)) |> 
  mutate(id = as.integer(ibge)) |> 
  select(id, regiao_saude, categoria, resultado, percentage)
Warning in left_join(., procedures_professional, by = "codigo_sigtap", suffix = c("_phc", : Each row in `x` is expected to match at most 1 row in `y`.
ℹ Row 1 of `x` matches multiple rows.
ℹ If multiple matches are expected, set `multiple = "all"` to silence this
  warning.
`summarise()` has grouped output by 'ano', 'mês_procedimento_realizado',
'ibge', 'CBO'. You can override using the `.groups` argument.
`summarise()` has grouped output by 'uf', 'cod_regsaud', 'regiao_saude',
'prof'. You can override using the `.groups` argument.
`summarise()` has grouped output by 'ibge', 'ano', 'categoria', 'uf'. You can
override using the `.groups` argument.
Adding missing grouping variables: `ibge`, `ano`, `uf`
DT::datatable(cenario_2)
cenario_2_go <- 
  cenario_2 |> 
  mutate(cenario = "scenario 2") |> 
  group_by(categoria, cenario) |> 
  summarise(resultado = sum(resultado))
`summarise()` has grouped output by 'categoria'. You can override using the
`.groups` argument.
cenario_2_go
# A tibble: 2 × 3
# Groups:   categoria [2]
  categoria  cenario    resultado
  <chr>      <chr>          <dbl>
1 Enfermeiro scenario 2    -8111.
2 Médico     scenario 2    -7114.

3 - Scenario 3 - Increase of Direct Activities

cenario_3 <- funcao_demanda_oferta(0.66, 1.26, 0.40, 0.49) |>  
  mutate(percentage = (oferta * 100)/demanda,
         percentage = round(percentage, 2)) |> 
  mutate(id = as.integer(ibge)) |> 
  select(id, regiao_saude, categoria, resultado, percentage)
Warning in left_join(., procedures_professional, by = "codigo_sigtap", suffix = c("_phc", : Each row in `x` is expected to match at most 1 row in `y`.
ℹ Row 1 of `x` matches multiple rows.
ℹ If multiple matches are expected, set `multiple = "all"` to silence this
  warning.
`summarise()` has grouped output by 'ano', 'mês_procedimento_realizado',
'ibge', 'CBO'. You can override using the `.groups` argument.
`summarise()` has grouped output by 'uf', 'cod_regsaud', 'regiao_saude',
'prof'. You can override using the `.groups` argument.
`summarise()` has grouped output by 'ibge', 'ano', 'categoria', 'uf'. You can
override using the `.groups` argument.
Adding missing grouping variables: `ibge`, `ano`, `uf`
DT::datatable(cenario_3)
cenario_3_go <- 
  cenario_3 |> 
  mutate(cenario = "scenario 3") |> 
  group_by(categoria, cenario) |> 
  summarise(resultado = sum(resultado))
`summarise()` has grouped output by 'categoria'. You can override using the
`.groups` argument.
cenario_3_go
# A tibble: 2 × 3
# Groups:   categoria [2]
  categoria  cenario    resultado
  <chr>      <chr>          <dbl>
1 Enfermeiro scenario 3   -10650.
2 Médico     scenario 3    -9718.

4 - Scenario 4 - Increase of Direct Activities

cenario_4 <- funcao_demanda_oferta(0.50, 1.00, 0.50, 0.59) |>  
  mutate(percentage = (oferta * 100)/demanda,
         percentage = round(percentage, 2)) |> 
  mutate(id = as.integer(ibge)) |> 
  select(id, regiao_saude, categoria, resultado, percentage)
Warning in left_join(., procedures_professional, by = "codigo_sigtap", suffix = c("_phc", : Each row in `x` is expected to match at most 1 row in `y`.
ℹ Row 1 of `x` matches multiple rows.
ℹ If multiple matches are expected, set `multiple = "all"` to silence this
  warning.
`summarise()` has grouped output by 'ano', 'mês_procedimento_realizado',
'ibge', 'CBO'. You can override using the `.groups` argument.
`summarise()` has grouped output by 'uf', 'cod_regsaud', 'regiao_saude',
'prof'. You can override using the `.groups` argument.
`summarise()` has grouped output by 'ibge', 'ano', 'categoria', 'uf'. You can
override using the `.groups` argument.
Adding missing grouping variables: `ibge`, `ano`, `uf`
DT::datatable(cenario_4)
cenario_4_go <- 
  cenario_4 |> 
  mutate(cenario = "scenario 4") |> 
  group_by(categoria, cenario) |> 
  summarise(resultado = sum(resultado))
`summarise()` has grouped output by 'categoria'. You can override using the
`.groups` argument.
cenario_4_go
# A tibble: 2 × 3
# Groups:   categoria [2]
  categoria  cenario    resultado
  <chr>      <chr>          <dbl>
1 Enfermeiro scenario 4    -5328.
2 Médico     scenario 4    -4461.

5 - Scenario 5 - Increase of Direct Activities

cenario_5 <- funcao_demanda_oferta(0.50, 1.00, 0.60, 0.69) |>  
  mutate(percentage = (oferta * 100)/demanda,
         percentage = round(percentage, 2)) |> 
  mutate(id = as.integer(ibge)) |> 
  select(id, regiao_saude, categoria, resultado, percentage)
Warning in left_join(., procedures_professional, by = "codigo_sigtap", suffix = c("_phc", : Each row in `x` is expected to match at most 1 row in `y`.
ℹ Row 1 of `x` matches multiple rows.
ℹ If multiple matches are expected, set `multiple = "all"` to silence this
  warning.
`summarise()` has grouped output by 'ano', 'mês_procedimento_realizado',
'ibge', 'CBO'. You can override using the `.groups` argument.
`summarise()` has grouped output by 'uf', 'cod_regsaud', 'regiao_saude',
'prof'. You can override using the `.groups` argument.
`summarise()` has grouped output by 'ibge', 'ano', 'categoria', 'uf'. You can
override using the `.groups` argument.
Adding missing grouping variables: `ibge`, `ano`, `uf`
DT::datatable(cenario_5)
cenario_5_go <- 
  cenario_5 |> 
  mutate(cenario = "scenario 5") |> 
  group_by(categoria, cenario) |> 
  summarise(resultado = sum(resultado))
`summarise()` has grouped output by 'categoria'. You can override using the
`.groups` argument.
cenario_5_go
# A tibble: 2 × 3
# Groups:   categoria [2]
  categoria  cenario    resultado
  <chr>      <chr>          <dbl>
1 Enfermeiro scenario 5    -3936.
2 Médico     scenario 5    -3134.

Mapa do cenário 5

We will now ilustrate the results, in terms of percentage, using a map.

spdf <- geojson_read("shape file regioes saude.json",  what = "sp")
cenario_5$id <- as.integer(cenario_5$id)

spdf_region <- spdf[ spdf@data$est_id == "52" , ]


spdf_fortified <- sf::st_as_sf(spdf_region)

x <- spdf_fortified |>
  left_join(cenario_5, by = c("reg_id"="id")) |>
  mutate(categoria = if_else(categoria == "Médico","Physician","Nurse")) |> 
  ggplot() +
  geom_sf(aes(fill = percentage)) +
  geom_sf_text(aes(label = regiao_saude), size = 3.5) +
  theme_minimal() +
  scale_fill_gradient(low = "#F8766D", high = "#00BA38", n.breaks = 10) +
  facet_wrap(~categoria, nrow = 1) +
  theme(
    axis.line = element_blank(),
    axis.text.x = element_blank(),
    axis.text.y = element_blank(),
    axis.ticks = element_blank(),
    axis.title.x = element_blank(),
    axis.title.y = element_blank()
  ) + ggtitle("Scenario 5")

x

Juntando tudo

cenarios <- rbind(cenario_1_go,
                  cenario_2_go,
                  cenario_3_go,
                  cenario_4_go,
                  cenario_5_go)


y <- cenarios |> 
  mutate(resultado = round(resultado)) |> 
  ggplot(aes(x = cenario, y = resultado, fill = categoria)) + 
  geom_col(position = "dodge") + 
  geom_text(aes(label = resultado), vjust = 1, position = position_dodge(width = 1)) + theme(legend.position="top") +
  theme_minimal() + xlab("Scenario") + ylab("Result") + 
  theme(text = element_text(size = 25))


y 

y + x

Complementary materials

b <- spdf_fortified |>
  left_join(demanda_oferta, by = c("reg_id"="id")) |>
#  mutate(categoria = if_else(categoria == "Médico","Physician","Nurse")) |> 
  ggplot() +
  geom_sf(fill="lightblue") +
  geom_sf_label(aes(label = regiao_saude), size = 2.8) +
  theme_minimal() +
 # scale_fill_gradient(low = "#F8766D", high = "#00BA38", n.breaks = 10) +
#  facet_wrap(~categoria, nrow = 1) +
  theme(
    axis.line = element_blank(),
    axis.text.x = element_blank(),
    axis.text.y = element_blank(),
    axis.ticks = element_blank(),
    axis.title.x = element_blank(),
    axis.title.y = element_blank()
  ) + ggtitle("Goiás Health Regions")
Warning in sf_column %in% names(g): Each row in `x` is expected to match at most 1 row in `y`.
ℹ Row 1 of `x` matches multiple rows.
ℹ If multiple matches are expected, set `multiple = "all"` to silence this
  warning.
b

library(patchwork)

b | (a + plot_spacer() + plot_spacer())

Predictions (current vs predictions)

previsoes <- read_delim("previsao_vs_atual/Previsão de nascidos por ano.csv", 
    delim = ";", escape_double = FALSE, trim_ws = TRUE) |> 
             mutate(date = as.Date(mes_ano, format = "%d/%m/%Y")) |> 
             gather(key = "type", value = "births", 
                    1:2) 
Rows: 90 Columns: 3
── Column specification ────────────────────────────────────────────────────────
Delimiter: ";"
chr (1): mes_ano
dbl (2): Real, Predicted

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
previsoes |> 
    ggplot(aes(x = date, y = births, col = type)) + geom_line(size = 1.25) +
  theme_minimal() + theme(text = element_text(size = 25)) 
Warning: Using `size` aesthetic for lines was deprecated in ggplot2 3.4.0.
ℹ Please use `linewidth` instead.
Warning: Removed 54 rows containing missing values (`geom_line()`).

figura

options(scipen = 999)

professional_services |> 
  select(procedimento, data, categoria, tempo) |> 
  group_by(procedimento) |> 
  summarise(workload = sum(tempo)) |> 
  ggplot(aes(x = fct_reorder(procedimento, workload), y = workload)) + 
  geom_col() + coord_flip() + theme_minimal() + xlab("Activities") +
  ylab("Workload for 2024 (in hours)") +
  theme(text = element_text(size = 25))

Looking for some reasons for discussion

cobertura <- read_delim("cobertura.csv", 
    delim = ";", escape_double = FALSE, trim_ws = TRUE)
cobertura_demanda_oferta <- 
  cobertura |> 
  select(CO_CIR, NO_REGIAO_SAUDE, PC_COBERTURA_AB, PC_COBERTURA_SF) |> 
  left_join(demanda_oferta, by = c("CO_CIR"="id"))
cobertura_demanda_oferta |> 
  filter(categoria == "Enfermeiro") |> 
  ggplot(aes(PC_COBERTURA_AB, percentage)) + geom_smooth(method = 'lm', se = FALSE) +
  geom_label(aes(label = NO_REGIAO_SAUDE)) + theme_minimal()

cobertura_demanda_oferta |> 
  filter(categoria == "Médico") |> 
  ggplot(aes(PC_COBERTURA_AB, percentage)) + geom_smooth(method = 'lm', se = FALSE) +
  geom_label(aes(label = NO_REGIAO_SAUDE)) + theme_minimal() 

##Human Development Index

hierarquia_municipios <- read_excel("hierarquia_municipios.xlsx", 
    col_types = c("text", "text", "numeric", 
        "text", "text", "text", "numeric", 
        "text", "text", "numeric", "text", 
        "text", "text", "numeric", "text", 
        "text", "numeric", "numeric"))

idh <- read_delim("idh.csv", delim = ";", 
    escape_double = FALSE, col_types = cols(cod_ibge = col_character()), 
    trim_ws = TRUE) |> 
      mutate(ibge = substr(cod_ibge, 1, 6)) |> 
      left_join(hierarquia_municipios, by = c("ibge"="cod_municipio")) |> 
      group_by(cod_regsaud) |> 
      summarise(idh_medio = mean(idh_2010))

demanda_oferta |> 
  left_join(idh, by = c("id"="cod_regsaud")) |> 
  ggplot(aes(x = idh_medio, y = percentage)) + 
  geom_label(aes(label = regiao_saude)) +
  geom_smooth(method = "lm", se = FALSE ) + facet_wrap(~categoria)
`geom_smooth()` using formula = 'y ~ x'